1 Introduction

2 Section 1

2.1 How many bags of coffee from each country were sampled?

We begin our analysis by summarising the samples behind the coffee ratings. For each country, different varieties of coffee were sampled from different regions and companies. The samples from each country can be summarised using the number_of_bags variable. Knowing the total quantity for each country lets us check whether the number of samples is related to the country's rating.

The table below summarises the total number of bags counted for each country.

We observe that the graded Colombian samples represent more than 40,000 bags of coffee.
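
The per-country totals described above can be sketched as a simple group-and-sum. This is a minimal illustration, not the code used for the report: the field names country_of_origin and number_of_bags follow the variables named in the text, while the records themselves are made up.

```python
from collections import defaultdict

# Illustrative records only; these are NOT values from the report's data set.
samples = [
    {"country_of_origin": "Colombia", "number_of_bags": 250},
    {"country_of_origin": "Colombia", "number_of_bags": 275},
    {"country_of_origin": "Ethiopia", "number_of_bags": 300},
    {"country_of_origin": "Mexico", "number_of_bags": 20},
]


def total_bags_by_country(rows):
    """Sum number_of_bags over all samples that share a country."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["country_of_origin"]] += row["number_of_bags"]
    # Sort from the largest total to the smallest, as a summary table would.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


bag_totals = total_bags_by_country(samples)
for country, bags in bag_totals:
    print(f"{country}: {bags}")
```

Summing bags rather than counting rows matters here, because a single graded sample can stand for many bags.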

Figure 2.1: Scatterplot of average coffee rating and number of testing samples

Figure 2.1 shows no apparent association between the average coffee rating and the number of testing samples.
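
A visual impression like this can be backed by a correlation coefficient. The sketch below computes Pearson's r by hand; the per-country values are hypothetical and chosen only to show what "no linear association" looks like numerically. Pearson's r captures linear association only, so it complements the scatterplot rather than replacing it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-country values, not taken from the report.
n_samples = [10, 20, 30, 40]
avg_rating = [82.0, 84.0, 84.0, 82.0]

r = pearson_r(n_samples, avg_rating)
print(f"r = {r:.2f}")  # a value near 0 means no linear association
```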

2.2 Which country produces the highest rated coffee?

Having found no apparent association between the number of testing samples and the coffee rating, we can now ask which countries produce the best-quality coffee.

Figure 2.2: Coffee Rating Distribution by Country

From Figure 2.2 we notice that the distribution of Ethiopian coffee ratings lies furthest to the right, that is, toward the highest ratings, indicating that the coffee quality is excellent.

3 Section 2

This section compares Arabica and Robusta coffee beans by country of origin, altitude of production areas and grading scores.

3.1 Which top three countries cultivated the most kinds of Arabica and Robusta coffee beans, respectively?

Table 3.1 shows that Mexico, Colombia and Guatemala cultivated the most kinds of Arabica coffee beans, while India, Uganda and Ecuador cultivated the most kinds of Robusta coffee beans. There are also far more kinds of Arabica than of Robusta coffee beans, which is consistent with Bunn et al. (2015).

Table 3.1: The top three countries that cultivated the most kinds of Arabica and Robusta coffee beans, respectively
country_of_origin  species    n
Mexico             Arabica  236
Colombia           Arabica  183
Guatemala          Arabica  181
India              Robusta   13
Uganda             Robusta   10
Ecuador            Robusta    2
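
A table of this shape can be produced by counting rows per country within each species and keeping the three largest counts. The sketch below assumes one row per graded sample; the rows are invented for illustration and the helper name top_countries is hypothetical.

```python
from collections import Counter

# Illustrative rows (country_of_origin, species); made-up data, one per sample.
rows = (
    [("Mexico", "Arabica")] * 4
    + [("Colombia", "Arabica")] * 3
    + [("Guatemala", "Arabica")] * 2
    + [("Brazil", "Arabica")] * 1
    + [("India", "Robusta")] * 2
    + [("Uganda", "Robusta")] * 1
)


def top_countries(rows, species, k=3):
    """Return the k countries with the most rows for the given species."""
    counts = Counter(country for country, sp in rows if sp == species)
    return counts.most_common(k)


for species in ("Arabica", "Robusta"):
    for country, n in top_countries(rows, species):
        print(country, species, n)
```

When a species has fewer than three countries, as Robusta does in this toy input, the function simply returns the countries that exist.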

Figure 3.1 shows the geographical location of these six countries. They appear to fall within a distinct coffee production zone lying roughly between the equator and 30 degrees north latitude, where the annual average temperature and rainfall suit the growing conditions of coffee beans.

The figure also indicates that the top Arabica-growing countries are all located in Latin America, while the top Robusta-growing countries are spread across several regions: South America, East Africa and South Asia. This pattern is related to the different environments the two species need in order to grow.

Figure 3.1: The geographical location of the top three countries that cultivated the most kinds of Arabica and Robusta coffee beans, respectively

3.2 How does the altitude of Arabica and Robusta coffee bean production areas differ?

Figure 3.2 indicates that the mean altitude of Arabica production areas is concentrated between 1000 and 1800 meters. Surprisingly, the mean altitude of Robusta production areas has two peaks, concentrated between 500 and 1600 meters and between 2800 and 3400 meters respectively. However, the density of the second peak is much lower than that of the first.

So the mean altitude of Arabica production areas is generally higher than that of most Robusta production areas, although there are some exceptions.
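
A two-peaked distribution like the one described for Robusta can also be seen in a simple binned count. The sketch below is an illustration under stated assumptions: the altitudes are hypothetical, and the 500-meter bin width is an arbitrary choice rather than the setting behind Figure 3.2.

```python
from collections import Counter


def altitude_histogram(altitudes, bin_width=500):
    """Count observations per altitude bin; keys are the lower bin edges."""
    bins = Counter((int(a) // bin_width) * bin_width for a in altitudes)
    return dict(sorted(bins.items()))


# Hypothetical mean altitudes in meters, not taken from the report.
robusta_altitudes = [600, 800, 900, 1200, 1300, 3000, 3100]

hist = altitude_histogram(robusta_altitudes)
for lower_edge, count in hist.items():
    print(f"{lower_edge:>5}-{lower_edge + 499} m: {'#' * count}")
```

Bins with no observations do not appear in the result, so a gap between two groups of keys points to separate peaks.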

Figure 3.2: The mean altitude of Arabica coffee beans and Robusta coffee beans production areas

3.3 Which species, Arabica or Robusta, has higher grades?

Figure 3.3 shows the scores of Arabica and Robusta coffee beans on several primary aspects, together with their total points. For acidity, aftertaste, aroma and flavor, the median score of Robusta is higher than that of Arabica, which means Robusta performs better on these aspects. For sweetness, Arabica scores much higher than Robusta. Finally, the total points, which combine these primary aspects with some other aspects, show that Arabica has the better overall quality. These grades may help people choose coffee beans.
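
The median comparison read off the boxplots can be reproduced numerically by grouping each aspect's scores by species. In the sketch below the field names acidity and sweetness are assumed column names, and the scores are invented, so the output illustrates the method only.

```python
from statistics import median

# Made-up scores; the aspect names are assumed column names.
scores = [
    {"species": "Arabica", "acidity": 7.4, "sweetness": 10.0},
    {"species": "Arabica", "acidity": 7.5, "sweetness": 10.0},
    {"species": "Arabica", "acidity": 7.6, "sweetness": 9.3},
    {"species": "Robusta", "acidity": 7.6, "sweetness": 7.5},
    {"species": "Robusta", "acidity": 7.8, "sweetness": 7.7},
    {"species": "Robusta", "acidity": 7.9, "sweetness": 7.9},
]


def median_by_species(rows, aspect):
    """Median score of one aspect, computed separately for each species."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["species"], []).append(row[aspect])
    return {species: median(values) for species, values in grouped.items()}


for aspect in ("acidity", "sweetness"):
    print(aspect, median_by_species(scores, aspect))
```

The median is used instead of the mean because it matches the center line of a boxplot and is less sensitive to a few extreme scores.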

Figure 3.3: Scores of several different aspects of Arabica coffee beans and Robusta coffee beans and their total points

4 Section 3

4.1 Which processing method leads to better ratings?

Figure 4.1: Distribution of Coffee Ratings based on Processing method

In Figure 4.1 the ratings for the Semi-washed/Semi-pulped and Pulped natural honey methods are better, as the average rating does not go below 8 for them. The Pulped natural honey process dries the coffee beans after the skin of the fruit has been removed but while the pulp is still on the beans. It is essentially a middle ground between the dry and wet processing methods. During the natural (or dry) method, the beans are dried entirely in their natural form, while the washed (or wet) process sees all of the soft fruit residue, both skin and pulp, removed before the coffee is dried (Costa, 2020). Figure 4.1 also shows that the Washed/Wet processing method has the lowest ratings, which suggests that it is not one of the best processing methods.

4.2 Which harvest year produced the best coffee?

Figure 4.2: Coffee Ratings in each harvest year

Figure 4.2 shows that the available data cover the harvest years 2010 to 2018. Among these, 2012 has the best ratings, which may be due to the favorable weather conditions in almost all coffee-producing countries (Huong & Quan, 2012), and 2018 has the lowest ratings.

Table 4.1: Number of records for each year
harvest_year  count
2012            354
2014            233
2013            181
2015            129
2016            124
2017             70
2011             26
2010             10
2018              1

As Table 4.1 shows, there is only one record for 2018, which implies that data for that year are missing from the data set.
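
A check of this kind can be automated by flagging years with too few records before comparing their ratings. The counts below are copied from Table 4.1; the cut-off of 30 records is an arbitrary assumption made for illustration, not a rule used in the report.

```python
# Record counts per harvest year, copied from Table 4.1.
harvest_counts = {
    2012: 354, 2014: 233, 2013: 181, 2015: 129, 2016: 124,
    2017: 70, 2011: 26, 2010: 10, 2018: 1,
}

# Assumed cut-off: years with fewer records are treated as too sparse.
MIN_RECORDS = 30


def sparse_years(counts, min_records=MIN_RECORDS):
    """Return the years whose record count falls below min_records."""
    return sorted(year for year, n in counts.items() if n < min_records)


flagged = sparse_years(harvest_counts)
print("Years with too few records:", flagged)
```

With this cut-off, 2010 and 2011 are flagged alongside 2018, so their ratings also rest on comparatively little data.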

5 Conclusion

6 References

Bunn, C., Läderach, P., Ovalle Rivera, O., & Kirschke, D. (2015). A bitter cup: Climate change profile of global production of Arabica and Robusta coffee. Climatic Change, 129(1), 89–101.

Costa, B. (2020, July 24). Coffee Processing: Understanding Pulped Natural Coffee. Perfect Daily Grind. https://perfectdailygrind.com/2016/06/coffee-processing-understanding-pulped-natural-coffee/

Huong, T., & Quan, Q. (2012, May 24). 2012 Coffee Annual. Global Agriculture Information Network.